Readability of Japanese Electronic Text with Phrase-based Line Breaking

Related resources

Text Readability and Word Distribution in Japanese

This paper reports the relation between text readability and word distribution in the Japanese language. There was no similar study in the past due to three major obstacles: (1) unclear definition of Japanese “word”, (2) no balanced corpus, and (3) no readability measure. Compilation of the Balanced Corpus of Contemporary Written Japanese (BCCWJ) and development of a readability predictor remov...

Automatic Assessment of Japanese Text Readability Based on a Textbook Corpus

Department of Electrical Engineering and Computer Science, Graduate School of Engineering, Nagoya University, Chikusa-ku, Nagoya, 464-8603, Japan. {matuyosi,kondoh}@sslab.nuee.nagoya-u.ac.jp. Abstract: This paper describes a method of readability measurement of Japanese texts based on a newly compiled textbook corpus. The textbook corpus consists of 1,478 sample passages ex...

Assessing Text Readability Using Cognitively Based Indices

Many programs designed to compute the readability of texts are narrowly based on surface-level linguistic features and take too little account of the processes which a reader brings to the text. This study is an exploratory examination of the use of Coh-Metrix, a computational tool that measures cohesion and text difficulty at various levels of language, discourse, and conceptual analysis. It i...

Text Document Clustering based on Phrase

Affinity propagation (AP) was recently introduced as an unsupervised learning algorithm for exemplar-based clustering. In this paper, a novel text document clustering algorithm has been developed based on the vector space model, phrases, and the affinity propagation clustering algorithm. The proposed algorithm can be called Phrase affinity clustering (PAC). PAC first finds the phrase by Ukkonen suffix tree con...

Phrase-Based Pattern Matching in Compressed Text

Byte codes are a practical alternative to the traditional bit-oriented compression approaches when large alphabets are being used, and trade away a small amount of compression effectiveness for a relatively large gain in decoding efficiency. Byte codes also have the advantage of being searchable using standard string matching techniques. Here we describe methods for searching in byte-coded comp...

Journal

Journal title: Transactions of the Japanese Society for Artificial Intelligence

Year: 2015

ISSN: 1346-0714,1346-8030

DOI: 10.1527/tjsai.30.479